
Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards

Neural Information Processing Systems

Hierarchical Reinforcement Learning (HRL) is a promising approach to solving long-horizon problems with sparse and delayed rewards. Many existing HRL algorithms either use pre-trained low-level skills that are unadaptable, or require domain-specific information to define low-level rewards. In this paper, we aim to adapt low-level skills to downstream tasks while maintaining the generality of reward design. We propose an HRL framework that sets auxiliary rewards for low-level skill training based on the advantage function of the high-level policy. This auxiliary reward enables efficient, simultaneous learning of the high-level policy and low-level skills without using task-specific knowledge. We also theoretically prove that optimizing low-level skills with this auxiliary reward will increase the task return for the joint policy. Experimental results show that our algorithm dramatically outperforms other state-of-the-art HRL methods in MuJoCo domains. We also find that both the low-level and high-level policies trained by our algorithm are transferable.
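The core idea above can be sketched in a few lines: estimate a one-sample advantage for the high-level action, then distribute it as an auxiliary reward over the low-level steps the skill executed. This is a minimal illustrative sketch, not the paper's exact objective; the function names and the even per-step split are assumptions.

```python
import numpy as np

def k_step_advantage(r_high, v_next, v_curr, gamma_h):
    """One sample of the high-level advantage:
    A_h(s, a) ~ r_h + gamma_h * V_h(s') - V_h(s)."""
    return r_high + gamma_h * v_next - v_curr

def auxiliary_rewards(high_adv, k):
    """Spread the high-level advantage evenly over the k low-level
    steps the active skill executed (hypothetical splitting rule)."""
    return np.full(k, high_adv / k)

adv = k_step_advantage(r_high=1.0, v_next=0.5, v_curr=0.2, gamma_h=0.99)
low_rews = auxiliary_rewards(adv, k=5)
```

The low-level rewards then supplement (or replace) the sparse task reward when updating the skill policies.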


Hierarchical DLO Routing with Reinforcement Learning and In-Context Vision-language Models

Li, Mingen, Yu, Houjian, Huang, Yixuan, Hong, Youngjin, Choi, Changhyun

arXiv.org Artificial Intelligence

Long-horizon routing tasks of deformable linear objects (DLOs), such as cables and ropes, are common in industrial assembly lines and everyday life. These tasks are particularly challenging because they require robots to manipulate DLOs with long-horizon planning and reliable skill execution. Successfully completing such tasks demands adapting to their nonlinear dynamics, decomposing abstract routing goals, and generating multi-step plans composed of multiple skills, all of which require accurate high-level reasoning during execution. In this paper, we propose a fully autonomous hierarchical framework for solving challenging DLO routing tasks. Given an implicit or explicit routing goal expressed in language, our framework leverages vision-language models (VLMs) for in-context high-level reasoning to synthesize feasible plans, which are then executed by low-level skills trained via reinforcement learning. To improve robustness over long horizons, we further introduce a failure recovery mechanism that reorients the DLO into insertion-feasible states. Our approach generalizes to diverse scenes involving object attributes, spatial descriptions, as well as implicit language commands. It outperforms the next best baseline method by nearly 50% and achieves an overall success rate of 92.5% across long-horizon routing scenarios. Please refer to our project page: https://icra2026-dloroute.github.io/DLORoute/
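The plan-then-execute loop with failure recovery described above can be sketched as follows. All names here (the plan format, `check_insertable`, `recover`) are hypothetical stand-ins for the paper's components: the VLM would produce the plan, RL policies would implement the skills.

```python
def execute_routing_plan(plan, skills, check_insertable, recover):
    """Outer execution loop for a hierarchical routing framework.
    plan: list of (skill_name, target) steps produced by a high-level
    planner; skills: dict mapping skill names to low-level controllers.
    Before each step, a feasibility check triggers a recovery action
    that reorients the DLO into an insertion-feasible state."""
    log = []
    for skill_name, target in plan:
        if not check_insertable(target):
            recover(target)              # failure recovery: reorient DLO
            log.append(("recover", target))
        skills[skill_name](target)       # execute the low-level skill
        log.append((skill_name, target))
    return log
```

In the real system each entry would be a learned policy rollout rather than a single function call, but the control flow is the same.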



Mastering Multi-Drone Volleyball through Hierarchical Co-Self-Play Reinforcement Learning

Zhang, Ruize, Xiang, Sirui, Xu, Zelai, Gao, Feng, Ji, Shilong, Tang, Wenhao, Ding, Wenbo, Yu, Chao, Wang, Yu

arXiv.org Artificial Intelligence

Competitive tasks have long served as benchmarks for progress in artificial intelligence. Landmark results have been achieved in domains such as Go [1], poker [2], and real-time strategy games [3], where agents learn to plan, adapt, and compete under structured rules. As research moves from virtual environments to the physical world, robot sports (structured, rule-based competitions involving physical agents) have emerged as a promising frontier for embodied intelligence. Examples include robot soccer [4, 5], table tennis [6, 7], and multi-drone pursuit-evasion [8], which combine high-level strategy with low-level motion control in physically grounded settings. In this paper, we tackle a new embodied competitive task proposed by the VolleyBots testbed [9]: 3v3 multi-drone volleyball. This task exemplifies the structure of a robot sport (well-defined objectives, explicit rules, and head-to-head competition) while presenting a set of unique and underexplored challenges. Each team must coordinate three quadrotors to rally a ball over a net, switching roles dynamically between offense and defense in a turn-based fashion. The environment is highly dynamic and demands precise timing, agile 3D maneuvering, and strategic team-level behavior. The turn-based nature of ball exchange introduces long-horizon temporal dependencies; the multi-agent setting requires tightly coupled tactics; and the underactuated dynamics of quadrotors call for fine-grained, reactive motor skills.


Toward Real-World Cooperative and Competitive Soccer with Quadrupedal Robot Teams

Su, Zhi, Gao, Yuman, Lukas, Emily, Li, Yunfei, Cai, Jiaze, Tulbah, Faris, Gao, Fei, Yu, Chao, Li, Zhongyu, Wu, Yi, Sreenath, Koushil

arXiv.org Artificial Intelligence

Achieving coordinated teamwork among legged robots requires both fine-grained locomotion control and long-horizon strategic decision-making. Robot soccer offers a compelling testbed for this challenge, combining dynamic, competitive, and multi-agent interactions. In this work, we present a hierarchical multi-agent reinforcement learning (MARL) framework that enables fully autonomous and decentralized quadruped robot soccer. First, a set of highly dynamic low-level skills is trained for legged locomotion and ball manipulation, such as walking, dribbling, and kicking. On top of these, a high-level strategic planning policy is trained with Multi-Agent Proximal Policy Optimization (MAPPO) via Fictitious Self-Play (FSP). This learning framework allows agents to adapt to diverse opponent strategies and gives rise to sophisticated team behaviors, including coordinated passing, interception, and dynamic role allocation. An extensive ablation study shows that the proposed learning method has significant advantages in the cooperative and competitive multi-agent soccer game. We deploy the learned policies to real quadruped robots relying solely on onboard proprioception and decentralized localization, with the resulting system supporting autonomous robot-robot and robot-human soccer matches on indoor and outdoor soccer courts.
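The fictitious self-play component mentioned above amounts to training against a mixture of past policy snapshots rather than only the latest one. A minimal sketch of the opponent-sampling step, with the mixing probability `p_latest` as an assumed hyperparameter:

```python
import random

def sample_opponent(policy_pool, current_policy, p_latest=0.5):
    """Fictitious self-play style opponent sampling (illustrative):
    with probability p_latest play against the latest policy,
    otherwise against a uniformly sampled past snapshot. Training
    against the historical mixture discourages strategy cycling."""
    if not policy_pool or random.random() < p_latest:
        return current_policy
    return random.choice(policy_pool)
```

After each training interval, the current policy would be appended to `policy_pool` so the opponent distribution keeps tracking the learner.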



Leveraging Extrinsic Dexterity for Occluded Grasping on Grasp Constraining Walls

Kobashi, Keita, Tomizuka, Masayoshi

arXiv.org Artificial Intelligence

This study addresses the problem of occluded grasping, where the primary grasp configurations of an object are not available due to occlusion by the environment. Simple parallel grippers often struggle with such tasks due to limited dexterity and actuation constraints. Prior works have explored object pose reorientation, such as pivoting, by utilizing extrinsic contacts between an object and an environment feature like a wall to make the object graspable. However, such works often assume the presence of a short wall, and this assumption may not always hold in real-world scenarios. If the wall available for interaction is too large or too tall, the robot may still fail to grasp the object even after pivoting, and must instead combine different types of actions to grasp it. To address this, we propose a hierarchical reinforcement learning (RL) framework. We use Q-learning to train a high-level policy that selects the type of action expected to yield the highest reward. The selected low-level skill then samples a specific robot action in continuous space. To guide the robot to an appropriate location for executing the selected action, we adopt a Conditional Variational Autoencoder (CVAE). We condition the CVAE on the object point cloud and the skill ID, enabling it to infer a suitable location based on the object geometry and the selected skill. To promote generalization, we apply domain randomization during the training of low-level skills. The RL policy is trained entirely in simulation with a box-like object and deployed to six objects in the real world. We conduct experiments to evaluate our method and demonstrate both its generalizability and robust sim-to-real transfer performance with promising success rates.
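The high-level Q-learning layer described above can be illustrated with a tabular sketch: the high-level policy picks a skill ID epsilon-greedily and updates its value from the observed return. This is a generic textbook form, not the paper's exact state representation or update.

```python
import random

def select_skill(q_table, state, n_skills, eps=0.1):
    """Epsilon-greedy high-level skill selection over a tabular Q."""
    if random.random() < eps:
        return random.randrange(n_skills)
    return max(range(n_skills), key=lambda a: q_table.get((state, a), 0.0))

def q_update(q_table, s, a, r, s_next, n_skills, alpha=0.1, gamma=0.95):
    """Standard one-step Q-learning update after a skill finishes."""
    best_next = max(q_table.get((s_next, b), 0.0) for b in range(n_skills))
    old = q_table.get((s, a), 0.0)
    q_table[(s, a)] = old + alpha * (r + gamma * best_next - old)
```

In the paper's pipeline, the selected skill ID (together with the object point cloud) would additionally condition a CVAE that proposes where to execute the skill.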


Multi-Task Multi-Agent Reinforcement Learning via Skill Graphs

Zhu, Guobin, Zhou, Rui, Ji, Wenkang, Zhang, Hongyin, Wang, Donglin, Zhao, Shiyu

arXiv.org Artificial Intelligence

Multi-task multi-agent reinforcement learning (MT-MARL) has recently gained attention for its potential to enhance MARL's adaptability across multiple tasks. However, existing multi-task learning methods struggle with complex problems, as they cannot handle unrelated tasks and possess limited knowledge transfer capabilities. In this paper, we propose a hierarchical approach that efficiently addresses these challenges. The high-level module utilizes a skill graph, while the low-level module employs a standard MARL algorithm. Our approach offers two contributions. First, we consider the MT-MARL problem in the context of unrelated tasks, expanding the scope of MTRL. Second, the skill graph is used as the upper layer of the standard hierarchical approach, with training independent of the lower layer, effectively handling unrelated tasks and enhancing knowledge transfer capabilities. Extensive experiments are conducted to validate these advantages and demonstrate that the proposed method outperforms the latest hierarchical MAPPO algorithms. Videos and code are available at https://github.com/WindyLab/MT-MARL-SG
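A skill graph in this role is essentially a transition structure over skills that the high-level module searches to produce a skill chain for a task. As a minimal sketch (the node names and adjacency-dict representation are assumptions, not the paper's data structure), a breadth-first search returns the shortest valid chain:

```python
from collections import deque

def skill_sequence(graph, start, goal):
    """BFS over a skill graph: nodes are skills, edges mark valid
    transitions; returns the shortest skill chain from start to goal,
    or None if the goal skill is unreachable."""
    frontier, parent = deque([start]), {start: None}
    while frontier:
        node = frontier.popleft()
        if node == goal:
            path = []
            while node is not None:     # walk parents back to start
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                frontier.append(nxt)
    return None
```

Because the graph is queried rather than trained jointly, the upper layer can be reused across unrelated tasks while the lower-layer MARL policies implement each skill.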


Dynamic Legged Ball Manipulation on Rugged Terrains with Hierarchical Reinforcement Learning

Zhu, Dongjie, Yang, Zhuo, Wu, Tianhang, Ge, Luzhou, Li, Xuesong, Liu, Qi, Li, Xiang

arXiv.org Artificial Intelligence

Advancing the dynamic loco-manipulation capabilities of quadruped robots in complex terrains is crucial for performing diverse tasks. Specifically, dynamic ball manipulation in rugged environments presents two key challenges. The first is coordinating distinct motion modalities to integrate terrain traversal and ball control seamlessly. The second is overcoming sparse rewards in end-to-end deep reinforcement learning, which impedes efficient policy convergence. To address these challenges, we propose a hierarchical reinforcement learning framework. A high-level policy, informed by proprioceptive data and ball position, adaptively switches between pre-trained low-level skills such as ball dribbling and rough terrain navigation. We further propose Dynamic Skill-Focused Policy Optimization to suppress gradients from inactive skills and enhance critical skill learning. Both simulation and real-world experiments validate that our methods outperform baseline approaches in dynamic ball manipulation across rugged terrains, highlighting their effectiveness in challenging environments. Videos are on our website: dribble-hrl.github.io.
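Suppressing gradients from inactive skills can be illustrated by masking the per-step loss so that only timesteps where the focused skill was active contribute to its update. This is an illustrative masking scheme, not the paper's exact objective:

```python
import numpy as np

def skill_focused_loss(per_step_losses, active_skill_ids, focus_skill):
    """Average the per-step losses only over timesteps where
    focus_skill was the active skill; other steps contribute no
    gradient to this skill's update."""
    mask = (np.asarray(active_skill_ids) == focus_skill).astype(float)
    losses = np.asarray(per_step_losses, dtype=float)
    return float((losses * mask).sum() / max(mask.sum(), 1.0))
```

With an autodiff framework the same mask would be applied before backpropagation, so inactive-skill steps produce zero gradient by construction.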


Reviews: Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards

Neural Information Processing Systems

This is an interesting approach and seems novel in the context of options, although it looks to have some similarities to potential-based reward shaping, e.g. (Devlin and Kudenko, 2012). The main advantages claimed for HAAR are (loosely) those of improved performance under sparse rewards and the learning of skills appropriate for transfer. These claims could be made more explicit, and that might help to justify the experimental section. The authors define advantage as: A_h(s_t^h, a_t^h) = E[r_t^h + \gamma_h V_h(s_{t+k}^h) - V_h(s_t^h)]. The meaning of this is a little ambiguous and I would prefer this to be clarified.
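One plausible reading of that definition, which the reviewer's ambiguity concern touches on, is a k-step advantage sample: discount the high-level rewards collected while the skill runs for k low-level steps, then bootstrap with the value difference. The decomposition of r_t^h into per-step rewards here is an assumption about the intended meaning, not the authors' stated one:

```python
def high_level_advantage(rewards_h, v_start, v_end, gamma_h):
    """One k-step advantage sample under the reading
    A_h = sum_i gamma_h^i * r_{t+i}^h
          + gamma_h^k * V_h(s_{t+k}^h) - V_h(s_t^h),
    where k = len(rewards_h)."""
    k = len(rewards_h)
    ret = sum(gamma_h**i * r for i, r in enumerate(rewards_h))
    return ret + gamma_h**k * v_end - v_start
```

If r_t^h is instead a single aggregated macro-step reward, k collapses to 1 in the discounting and the formula reduces to the usual one-step advantage, which is exactly the distinction worth clarifying.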